AITopics | tsallis entropy

Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems

Neural Information Processing SystemsJun-18-2026, 17:30:34 GMT

Follow-the-Regularized-Leader (FTRL) policies have achieved Best-of-BothWorlds (BOBW) results in various settings through hybrid regularizers, whereas analogous results for Follow-the-Perturbed-Leader (FTPL) remain limited due to inherent analytical challenges. To advance the analytical foundations of FTPL, we revisit classical FTRL-FTPL duality for unbounded perturbations and establish BOBW results for FTPL under a broad family of asymmetric unbounded Fréchettype perturbations, including hybrid perturbations combining Gumbel-type and Fréchet-type tails. These results not only extend the BOBW results of FTPL but also offer new insights into designing alternative FTPL policies competitive with hybrid regularization approaches. Motivated by earlier observations in two-armed bandits, we further investigate the connection between the 1/2-Tsallis entropy and a Fréchet-type perturbation. Our numerical observations suggest that it corresponds to a symmetric Fréchet-type perturbation, and based on this, we establish the first BOBW guarantee for symmetric unbounded perturbations in the two-armed setting. In contrast, in general multi-armed bandits, we find an instance in which symmetric Fréchet-type perturbations violate the key condition for standard BOBW analysis, which is a problem not observed with asymmetric or nonnegative Fréchet-type perturbations. Although this example does not rule out alternative analyses achieving BOBW results, it suggests the limitations of directly applying the relationship observed in two-armed cases to the general case and thus emphasizes the need for further investigation to fully understand the behavior of FTPL in broader settings.

artificial intelligence, data mining, machine learning, (21 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Add feedback

A Hybrid Tsallis-Polarization Impurity Measure for Decision Trees: Theoretical Foundations and Empirical Evaluation

Lansiaux, Edouard, Jairi, Idriss, Zgaya-Biau, Hayfa

arXiv.org Machine LearningMar-17-2026

We introduce the Integrated Tsallis Combination (ITC), a hybrid impurity measure for decision tree learning that combines normalized Tsallis entropy with an exponential polarization component. While many existing measures sacrifice theoretical soundness for computational efficiency or vice versa, ITC provides a mathematically principled framework that balances both aspects. The core innovation lies in the complementarity between Tsallis entropy's information-theoretic foundations and the polarization component's sensitivity to distributional asymmetry. We establish key theoretical properties-concavity under explicit parameter conditions, proper boundary conditions, and connections to classical measures-and provide a rigorous justification for the hybridization strategy. Through an extensive comparative evaluation on seven benchmark datasets comparing 23 impurity measures with five-fold repetition, we show that simple parametric measures (Tsallis $α=0.5$) achieve the highest average accuracy ($91.17\%$), while ITC variants yield competitive results ($88.38-89.16\%$) with strong theoretical guarantees. Statistical analysis (Friedman test: $χ^2=3.89$, $p=0.692$) reveals no significant global differences among top performers, indicating practical equivalence for many applications. ITC's value resides in its solid theoretical grounding-proven concavity under suitable conditions, flexible parameterization ($α$, $β$, $γ$), and computational efficiency $O(K)$-making it a rigorous, generalizable alternative when theoretical guarantees are paramount. We provide guidelines for measure selection based on application priorities and release an open-source implementation to foster reproducibility and further research.

artificial intelligence, impurity measure, machine learning, (16 more...)

arXiv.org Machine Learning

2603.13241

Country:

North America > United States > Wisconsin (0.04)
Europe > France > Occitanie > Hérault > Montpellier (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Maximum Causal Tsallis Entropy Imitation Learning

Kyungjae Lee, Sungjoon Choi, Songhwai Oh

Neural Information Processing SystemsMar-14-2026, 19:24:45 GMT

In this paper, we propose a novel maximum causal Tsallis entropy (MCTE) framework for imitation learning which can efficiently learn a sparse multi-modal policy distribution from demonstrations. We provide the full mathematical analysis of the proposed framework. First, the optimal solution of an MCTE problem is shown to be a sparsemax distribution, whose supporting set can be adjusted. The proposed method has advantages over a softmax distribution in that it can exclude unnecessary actions by assigning zero probability. Second, we prove that an MCTE problem is equivalent to robust Bayes estimation in the sense of the Brier score. Third, we propose a maximum causal Tsallis entropy imitation learning (MCTEIL) algorithm with a sparse mixture density network (sparse MDN) by modeling mixture weights using a sparsemax distribution. In particular, we show that the causal Tsallis entropy of an MDN encourages exploration and efficient mixture utilization while Shannon entropy is less effective.

artificial intelligence, demonstration, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Add feedback

b3e866c228f8f4ea18021ae63aea5453-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 16:13:45 GMT

artificial intelligence, machine learning, regularization, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
North America > United States (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.68)

Add feedback

Maximum Causal Tsallis Entropy Imitation Learning

Kyungjae Lee, Sungjoon Choi, Songhwai Oh

Neural Information Processing SystemsFeb-12-2026, 10:58:36 GMT

Neural Information Processing Systems http://nips.cc/

demonstration, entropy, imitation, (15 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.51)

Add feedback

Best-of-All-WorldsBoundsfor OnlineLearningwithFeedbackGraphs

Neural Information Processing SystemsFeb-11-2026, 20:03:23 GMT

Here, θ() is the clique coveringnumberofthegraph. Online learning models a repeated interaction between a learner and an environment.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Industry: Education (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

b3e866c228f8f4ea18021ae63aea5453-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 05:22:37 GMT

artificial intelligence, machine learning, regularization, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Robots (0.68)

Add feedback

A Regularized Approach to Sparse Optimal Policy in Reinforcement Learning

Wenhao Yang, Xiang Li, Zhihua Zhang

Neural Information Processing SystemsOct-2-2025, 14:46:58 GMT

We propose and study a general framework for regularized Markov decision processes (MDPs) where the goal is to find an optimal policy that maximizes the expected discounted total reward plus a policy regularization term.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country: Asia > China (0.15)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems

Lee, Jongyeong, Honda, Junya, Ito, Shinji, Oh, Min-hwan

arXiv.org Machine LearningAug-27-2025

Follow-the-Regularized-Leader (FTRL) policies have achieved Best-of-Both-Worlds (BOBW) results in various settings through hybrid regularizers, whereas analogous results for Follow-the-Perturbed-Leader (FTPL) remain limited due to inherent analytical challenges. To advance the analytical foundations of FTPL, we revisit classical FTRL-FTPL duality for unbounded perturbations and establish BOBW results for FTPL under a broad family of asymmetric unbounded Fréchet-type perturbations, including hybrid perturbations combining Gumbel-type and Fréchet-type tails. These results not only extend the BOBW results of FTPL but also offer new insights into designing alternative FTPL policies competitive with hybrid regularization approaches. Motivated by earlier observations in two-armed bandits, we further investigate the connection between the $1/2$-Tsallis entropy and a Fréchet-type perturbation. Our numerical observations suggest that it corresponds to a symmetric Fréchet-type perturbation, and based on this, we establish the first BOBW guarantee for symmetric unbounded perturbations in the two-armed setting. In contrast, in general multi-armed bandits, we find an instance in which symmetric Fréchet-type perturbations violate the key condition for standard BOBW analysis, which is a problem not observed with asymmetric or nonnegative Fréchet-type perturbations. Although this example does not rule out alternative analyses achieving BOBW results, it suggests the limitations of directly applying the relationship observed in two-armed cases to the general case and thus emphasizes the need for further investigation to fully understand the behavior of FTPL in broader settings.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

2508.18604

Country: